Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unconditionally lower std::string's alignment requirement from 16 to 8. #68925

Merged
merged 1 commit into from
Jan 24, 2024

Conversation

EricWF
Copy link
Member

@EricWF EricWF commented Oct 12, 2023

Unconditionally change std::string's alignment to 8.

This change saves memory by providing the allocator more freedom to allocate the most
efficient size class by dropping the alignment requirements for std::string's
pointer from 16 to 8. This changes the output of std::string::max_size, which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value of "max_size()" changes, which in turn changes whether std::bad_alloc or std::length_error is thrown for large allocations.

This change is the child of PR #68807, which enabled the change behind an ABI flag.

@EricWF EricWF added the libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. label Oct 12, 2023
@llvmbot
Copy link
Member

llvmbot commented Oct 12, 2023

@llvm/pr-subscribers-libcxx

Author: Eric (EricWF)

Changes

As requested in #68807


Full diff: https://github.com/llvm/llvm-project/pull/68925.diff

4 Files Affected:

  • (modified) libcxx/docs/ReleaseNotes/18.rst (+6)
  • (modified) libcxx/include/string (+1-1)
  • (added) libcxx/test/libcxx/strings/basic.string/string.capacity/allocation_size.pass.cpp (+31)
  • (modified) libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp (+2-1)
diff --git a/libcxx/docs/ReleaseNotes/18.rst b/libcxx/docs/ReleaseNotes/18.rst
index 5f43d2f2afe22d3..57f0fa1d7df53d9 100644
--- a/libcxx/docs/ReleaseNotes/18.rst
+++ b/libcxx/docs/ReleaseNotes/18.rst
@@ -133,6 +133,12 @@ ABI Affecting Changes
   results in an ABI break, however in practice we expect uses of ``std::projected`` in ABI-sensitive places to be
   extremely rare. Any error resulting from this change should result in a link-time error.
 
+- The internal alignment requirements for heap allocations inside std::string has decreased from 16 to 8.
+  This save memory since string requests fewer additional bytes than it did previously. However, this
+  also changes the return value of std::string::max_length and can cause code compiled against older
+  libc++ versions but linked at runtime to a new version to thrown a different exception
+  when attempting allocations that are too large (std::bad_alloc vs std::length_error).
+
 Build System Changes
 --------------------
 
diff --git a/libcxx/include/string b/libcxx/include/string
index 33e87406a1156a6..4badb55a78944a1 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1851,7 +1851,7 @@ private:
         _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
         size_type __align_it(size_type __s) _NOEXCEPT
             {return (__s + (__a-1)) & ~(__a-1);}
-    enum {__alignment = 16};
+    enum { __alignment = 8 };
     static _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX20
     size_type __recommend(size_type __s) _NOEXCEPT
     {
diff --git a/libcxx/test/libcxx/strings/basic.string/string.capacity/allocation_size.pass.cpp b/libcxx/test/libcxx/strings/basic.string/string.capacity/allocation_size.pass.cpp
new file mode 100644
index 000000000000000..6917a2c48128e81
--- /dev/null
+++ b/libcxx/test/libcxx/strings/basic.string/string.capacity/allocation_size.pass.cpp
@@ -0,0 +1,31 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// <string>
+
+// This test demonstrates the smaller allocation sizes when the alignment
+// requirements of std::string are dropped from 16 to 8.
+
+#include <algorithm>
+#include <cassert>
+#include <cstddef>
+#include <string>
+
+#include "test_macros.h"
+
+int main(int, char**) {
+  std::string input_string;
+  input_string.resize(64, 'a');
+
+  // Call a constructor which selects its size using __recommend.
+  std::string test_string(input_string.data());
+  constexpr std::size_t expected_align8_size = 71;
+  // Previously, when the alignment used to be 16 bytes, the expected
+  // capacity was 79.
+  assert(test_string.capacity() == expected_align8_size);
+}
diff --git a/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp b/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp
index 5af9cab0be4e80a..a13b3826d5d564c 100644
--- a/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp
+++ b/libcxx/test/libcxx/strings/basic.string/string.capacity/max_size.pass.cpp
@@ -18,7 +18,8 @@
 #include "test_macros.h"
 
 // alignment of the string heap buffer is hardcoded to 16
-static const std::size_t alignment = 16;
+
+static const std::size_t alignment = 8;
 
 template <class = int>
 TEST_CONSTEXPR_CXX20 void full_size() {

@ldionne
Copy link
Member

ldionne commented Oct 19, 2023

@EricWF Let's make this change!

@EricWF EricWF force-pushed the without-flag branch 2 times, most recently from 46d64e6 to 7523a56 Compare January 22, 2024 23:28
@EricWF EricWF requested a review from ldionne January 22, 2024 23:29
@EricWF EricWF marked this pull request as ready for review January 22, 2024 23:57
@EricWF EricWF requested a review from a team as a code owner January 22, 2024 23:57
Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

@ldionne
Copy link
Member

ldionne commented Jan 23, 2024

It looks like this is breaking the backdeployment tests, which check the exact alignment. It should be sufficient to XFAIL them for older macOS deployment targets (anything before and including macOS 14.x which is the latest released version).

@mstorsjo
Copy link
Member

I see an ABI break being mentioned in the commit message, as being noted and agreed to be acceptable. I'm interested to hear about all the aspects of ABI break that was considered and deemed acceptable. In particular, won't this also break the layout of user's data structures? If I have e.g. struct { char c; std::string str; }, this struct will now have a different layout after this change, effectively breaking the ABI of potentially any struct/class that embeds a string?

Copy link
Member

@ldionne ldionne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mstorsjo

I see an ABI break being mentioned in the commit message, as being noted and agreed to be acceptable. I'm interested to hear about all the aspects of ABI break that was considered and deemed acceptable.

This changes the value returned by std::basic_string::max_size(), which could in theory be relied upon, but in practice no sane code would ever rely on that for anything serious. In fact, the output of that function was wrong for 10+ years and we changed it a couple of releases ago. IMO changing the value returned by this function is as much of an ABI break as changing the return value of any other function to e.g. fix a bug -- it's really not much of a break.

In particular, won't this also break the layout of user's data structures? If I have e.g. struct { char c; std::string str; }, this struct will now have a different layout after this change, effectively breaking the ABI of potentially any struct/class that embeds a string?

No, this only affects the alignment of the data allocated by the string in long mode. This doesn't change the layout of the std::string object itself.

If someone can spot that this is an ABI break in other ways, please let us know, but after thinking about it for a bit I can't think of anything.

libcxx/include/__config Show resolved Hide resolved
libcxx/include/__config Show resolved Hide resolved
@mstorsjo
Copy link
Member

@mstorsjo

I see an ABI break being mentioned in the commit message, as being noted and agreed to be acceptable. I'm interested to hear about all the aspects of ABI break that was considered and deemed acceptable.

This changes the value returned by std::basic_string::max_size(), which could in theory be relied upon, but in practice no sane code would ever rely on that for anything serious. In fact, the output of that function was wrong for 10+ years and we changed it a couple of releases ago. IMO changing the value returned by this function is as much of an ABI break as changing the return value of any other function to e.g. fix a bug -- it's really not much of a break.

In particular, won't this also break the layout of user's data structures? If I have e.g. struct { char c; std::string str; }, this struct will now have a different layout after this change, effectively breaking the ABI of potentially any struct/class that embeds a string?

No, this only affects the alignment of the data allocated by the string in long mode. This doesn't change the layout of the std::string object itself.

Oh, ok - thanks, that clears my concern! Yes this sounds perfectly safe to me then.

@EricWF EricWF requested a review from ldionne January 23, 2024 21:31
@EricWF EricWF force-pushed the without-flag branch 2 times, most recently from db120fa to 5f9cb83 Compare January 23, 2024 21:37
    Save memory by providing the allocator more freedom to allocate the most
    efficient size class by dropping the alignment requirements for std::string's
    pointer from 16 to 8. This changes the output of std::string::max_size,
    which makes it ABI breaking.

    That said, the discussion concluded that we don't care about this ABI
    break and would like this change enabled universally.

    The ABI break isn't one of layout or "class size", but rather the value of
    "max_size()" changes, which in turn changes whether std::bad_alloc or
    std::length_error is thrown for large allocations.

    This change is the child of PR llvm#68807, which enabled the change behind an ABI
    flag.
@ldionne ldionne added this to the LLVM 18.0.X Release milestone Jan 23, 2024
@EricWF EricWF merged commit 04ce0ba into llvm:main Jan 24, 2024
44 checks passed
@EricWF EricWF deleted the without-flag branch January 24, 2024 19:52
EricWF added a commit to efcs/llvm-project that referenced this pull request Jan 25, 2024
…8. (llvm#68925)

Unconditionally change std::string's alignment to 8.

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR llvm#68807, which enabled the change behind
an ABI flag.
tstellar pushed a commit that referenced this pull request Feb 7, 2024
… to 8 (#68925) (#79480)

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR #68807, which enabled the change behind
an ABI flag.
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
… to 8 (llvm#68925) (llvm#79480)

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR llvm#68807, which enabled the change behind
an ABI flag.
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
… to 8 (llvm#68925) (llvm#79480)

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR llvm#68807, which enabled the change behind
an ABI flag.
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
… to 8 (llvm#68925) (llvm#79480)

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR llvm#68807, which enabled the change behind
an ABI flag.
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
… to 8 (llvm#68925) (llvm#79480)

This change saves memory by providing the allocator more freedom to
allocate the most
efficient size class by dropping the alignment requirements for
std::string's
pointer from 16 to 8. This changes the output of std::string::max_size,
which makes it ABI breaking.

That said, the discussion concluded that we don't care about this ABI
break. and would like this change enabled universally.

The ABI break isn't one of layout or "class size", but rather the value
of "max_size()" changes, which in turn changes whether `std::bad_alloc`
or `std::length_error` is thrown for large allocations.

This change is the child of PR llvm#68807, which enabled the change behind
an ABI flag.
@pointhex pointhex mentioned this pull request May 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
libc++ libc++ C++ Standard Library. Not GNU libstdc++. Not libc++abi. performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants